
    Evaluating Anytime Algorithms for Learning Optimal Bayesian Networks

    Exact algorithms for learning Bayesian networks guarantee to find provably optimal networks. However, they may fail on difficult learning tasks due to limited time or memory. In this research, we adapt several anytime heuristic search-based algorithms to learn Bayesian networks. These algorithms find high-quality solutions quickly and continually improve the incumbent solution, or prove its optimality, before resources are exhausted. Empirical results show that the anytime window A* algorithm usually finds higher-quality, often optimal, networks more quickly than other approaches. The results also show that, surprisingly, although generating networks with few parents per variable are structurally simpler, they are harder to learn than complex generating networks with more parents per variable.
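The anytime contract described above, find a good solution fast, then keep improving it until optimality is proven or resources run out, can be illustrated with a minimal depth-first branch-and-bound sketch. This is a generic illustration, not the paper's Anytime Window A* algorithm; the graph, function names, and costs are invented for the example.

```python
def anytime_dfbnb(start, goal, neighbors):
    """Depth-first branch-and-bound: a simple anytime scheme (hypothetical
    sketch, not Anytime Window A*). Yields each strictly better goal cost;
    once the search space is exhausted, the last yield is provably optimal."""
    incumbent = [float("inf")]

    def dfs(node, g, path):
        if g >= incumbent[0]:          # prune: cannot beat the incumbent
            return
        if node == goal:
            incumbent[0] = g           # improved incumbent solution
            yield g, path
            return
        for nbr, cost in neighbors(node):
            if nbr not in path:        # avoid cycles
                yield from dfs(nbr, g + cost, path + (nbr,))

    yield from dfs(start, 0, (start,))

# Toy graph: a quick but expensive route A->D, and a cheaper longer one.
graph = {"A": [("D", 10), ("B", 1)], "B": [("C", 1)], "C": [("D", 1)], "D": []}
incumbents = [c for c, _ in anytime_dfbnb("A", "D", lambda n: graph[n])]
# incumbents improve monotonically: first the quick solution, then the optimum
```

Interrupting the generator at any point still leaves a valid (if possibly suboptimal) incumbent, which is exactly the property the abstract's algorithms exploit.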

    Steroid-sensitive nephrotic syndrome candidate gene CLVS1 regulates podocyte oxidative stress and endocytosis

    We performed next-generation sequencing in patients with familial steroid-sensitive nephrotic syndrome (SSNS) and identified a homozygous segregating variant (p.H310Y) in the gene encoding clavesin-1 (CLVS1) in a consanguineous family with 3 affected individuals. Knockdown of the clavesin gene in zebrafish (clvs2) produced edema phenotypes due to disruption of podocyte structure and loss of glomerular filtration barrier integrity that could be rescued by WT CLVS1 but not the p.H310Y variant. Analysis of cultured human podocytes with CRISPR/Cas9-mediated CLVS1 knockout or homozygous H310Y knockin revealed deficits in clathrin-mediated endocytosis and increased susceptibility to apoptosis that could be rescued with corticosteroid treatment, mimicking the steroid responsiveness observed in patients with SSNS. The p.H310Y variant also disrupted binding of clavesin-1 to α-tocopherol transfer protein, resulting in increased reactive oxygen species (ROS) accumulation in CLVS1-deficient podocytes. Treatment of CLVS1-knockout or homozygous H310Y-knockin podocytes with pharmacological ROS inhibitors restored viability to control levels. Taken together, these data identify CLVS1 as a candidate gene for SSNS, provide insight into the therapeutic effects of corticosteroids on podocyte cellular dynamics, and add to the growing evidence of the importance of endocytosis and oxidative stress regulation to podocyte function.

    A Bounded Error, Anytime, Parallel Algorithm for Exact Bayesian Network Structure Learning

    Bayesian network structure learning is NP-hard. Several anytime structure learning algorithms have been proposed which guarantee to learn optimal networks if given enough resources. In this paper, we describe a general-purpose, anytime search algorithm with bounded error that also guarantees optimality. We give an efficient, sparse representation of a key data structure for structure learning. Empirical results show our algorithm often finds better networks more quickly than state-of-the-art methods. They also highlight that accepting a small, bounded amount of suboptimality can reduce the memory and runtime requirements of structure learning by several orders of magnitude.
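The bounded-error idea above rests on a simple stopping test: if the incumbent's cost is within a factor (1 + ε) of a lower bound on the optimum (e.g., the best f-value remaining on the search frontier, for a minimization problem), the search can terminate early with a proven quality guarantee. A minimal sketch, with assumed names and numbers chosen only for illustration:

```python
def within_bound(incumbent_cost, lower_bound, epsilon):
    """Bounded-suboptimality stopping test (generic sketch). For minimization,
    optimal >= lower_bound, so if the incumbent is within a (1 + epsilon)
    factor of the lower bound it is within that factor of optimal too."""
    return incumbent_cost <= (1 + epsilon) * lower_bound

# Accepting 5% suboptimality lets the search stop here...
within_bound(102.0, 100.0, 0.05)   # True: 102 <= 105
# ...but a weaker incumbent still forces the search to continue.
within_bound(110.0, 100.0, 0.05)   # False: 110 > 105
```

Terminating as soon as this test passes is what lets the paper's algorithm trade a small, bounded amount of suboptimality for large memory and runtime savings.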

    Empirical Hardness of Finding Optimal Bayesian Network Structures: Algorithm Selection and Runtime Prediction

    Various algorithms have been proposed for finding a Bayesian network structure that is guaranteed to maximize a given scoring function. Implementations of state-of-the-art algorithms (solvers) for this Bayesian network structure learning problem rely on adaptive search strategies, such as branch-and-bound and integer linear programming techniques. Thus, the time requirements of the solvers are not well characterized by simple functions of the instance size. Furthermore, no single solver dominates the others in speed. Given a problem instance, it is thus a priori unclear which solver will perform best and how fast it will solve the instance. We show that, for a given solver, the hardness of a problem instance can be efficiently predicted based on a collection of non-trivial features which go beyond the basic parameters of instance size. Specifically, we train and test statistical models on empirical data, based on the largest evaluation of state-of-the-art exact solvers to date. We demonstrate that we can predict the runtimes to a reasonable degree of accuracy. These predictions enable effective selection of solvers that perform well in terms of runtime on a particular instance. Thus, this work contributes a highly efficient portfolio solver that makes use of several individual solvers.
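The portfolio scheme described above can be sketched in a few lines: learn a mapping from instance features to per-solver runtimes, then run whichever solver has the smallest predicted runtime. This is a hypothetical illustration using 1-nearest-neighbour regression; the feature names, solver names, and runtimes are invented, and the paper's actual statistical models and feature set are more sophisticated.

```python
import math

# Hypothetical training data: (instance features, observed runtimes per solver).
# Features here: [number of variables, mean parent-set size] -- invented.
train = [
    ([20, 3.0], {"ilp": 12.0, "astar": 40.0}),
    ([40, 1.5], {"ilp": 90.0, "astar": 15.0}),
    ([60, 2.0], {"ilp": 30.0, "astar": 300.0}),
]

def predict_runtimes(features):
    """Predict each solver's runtime as that of the nearest training instance."""
    nearest = min(train, key=lambda row: math.dist(features, row[0]))
    return nearest[1]

def select_solver(features):
    """Portfolio selection: run the solver with the smallest predicted runtime."""
    preds = predict_runtimes(features)
    return min(preds, key=preds.get)
```

Because no single solver dominates, even this crude per-instance selection can beat always running any one solver, which is the core argument for the portfolio approach.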

    Empirical evaluation of scoring functions for Bayesian network model selection

    In this work, we empirically evaluate the capability of various scoring functions of Bayesian networks for recovering true underlying structures. Similar investigations have been carried out before, but they typically relied on approximate learning algorithms to learn the network structures. The suboptimal structures found by the approximation methods have unknown quality and may affect the reliability of their conclusions. Our study uses an optimal algorithm to learn Bayesian network structures from datasets generated from a set of gold-standard Bayesian networks. Because all optimal algorithms always learn equivalent networks, this ensures that only the choice of scoring function affects the learned networks. Another shortcoming of the previous studies stems from their use of random synthetic networks as test cases. There is no guarantee that these networks reflect real-world data. We use real-world data to generate our gold-standard structures, so our experimental design more closely approximates real-world situations. A major finding of our study suggests that, in contrast to results reported by several prior works, the Minimum Description Length (MDL) score (or equivalently, the Bayesian information criterion (BIC)) consistently outperforms other scoring functions, such as Akaike's information criterion (AIC), the Bayesian Dirichlet equivalence score (BDeu), and factorized normalized maximum likelihood (fNML), in recovering the underlying Bayesian network structures. We believe this finding is a result of using both datasets generated from real-world applications rather than from random processes used in previous studies and learning algorithms that select high-scoring structures rather than random models. Other findings of our study support existing work, e.g., large sample sizes result in learning structures closer to the true underlying structure; the BDeu score is sensitive to its parameter settings; and fNML performs well on small datasets. We also tested a greedy hill-climbing algorithm and observed results similar to those of the optimal algorithm.
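The BIC/MDL score the abstract favours is decomposable: the network score is the sum of per-variable local scores, each a log-likelihood term minus a complexity penalty of (log N)/2 per free parameter. A minimal sketch of the textbook formula, assuming `data` is a list of dicts mapping variable names to discrete values (the data and variable names below are invented for illustration):

```python
import math
from collections import Counter

def bic_local_score(data, child, parents):
    """BIC/MDL local score for one variable given a candidate parent set
    (textbook formula; higher is better). The full network score is the
    sum of local scores over all variables."""
    n = len(data)
    # Counts of (child value, parent configuration) and of parent configurations.
    joint = Counter(tuple([row[child]] + [row[p] for p in parents]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    # Maximized log-likelihood: sum over cells of N_ijk * log(N_ijk / N_ij).
    loglik = sum(nijk * math.log(nijk / parent_counts[key[1:]])
                 for key, nijk in joint.items())
    r = len({row[child] for row in data})   # child cardinality
    q = max(len(parent_counts), 1)          # observed parent configurations
    penalty = 0.5 * math.log(n) * q * (r - 1)
    return loglik - penalty

# Toy data in which B deterministically copies A: the parent set {A}
# should score higher for B than the empty parent set.
data = [{"A": a, "B": a} for a in (0, 0, 0, 0, 1, 1, 1, 1)]
bic_local_score(data, "B", ["A"]) > bic_local_score(data, "B", [])
```

An exact structure learner maximizes the sum of such local scores over all acyclic parent-set assignments; the penalty term is what gives MDL/BIC its preference for sparser structures.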